Abstract



This survey, aimed at information processing researchers, highlights intriguing but lesser known results, corrects misconceptions, and suggests research areas. Themes include: certainty in quantum algorithms; the "fewer worlds" theory of quantum mechanics; quantum learning; probability theory versus quantum mechanics.



This idiosyncratic survey delves into areas of quantum 
information processing of interest to researchers in fields 
like information retrieval, machine learning, and artificial 
intelligence. It overviews intriguing but lesser known re- 
sults, corrects common misconceptions, and suggests re- 
search directions. Three types of applications of a quantum 
viewpoint on information processing are discussed: quan- 
tum algorithms and protocols; quantum proofs for classical 
results; the use of formalisms developed for quantum me- 
chanics in other areas with linear algebraic or probabilistic 
components. This paper is not tutorial in nature; readers 
new to the field should read it in conjunction with a tutorial 
(Rieffel & Polak 2000) or book (Nielsen & Chuang 2001;
Rieffel & Polak in preparation) on the subject.

A number of themes underlie this paper: certainty in 
quantum algorithms and quantum mechanics, including a 
"fewer worlds" correction to popular conceptions of the 
"many worlds" interpretation of quantum mechanics; re- 
lations and distinct differences between probability theory 
and quantum mechanics, including how entanglement differs
from correlation; what is known and what remains uncertain
as to the source of the power of quantum information
processing. The most startling thing about quantum
mechanics is not that it is probabilistic, but rather that it dis- 
obeys fundamental laws of probability theory. A common 
framework encompassing both probability theory and quan- 
tum mechanics throws light on many of these themes. The 
most technical parts of the paper establish this framework 
and discuss its implications. 
What is and isn't quantum information
processing 

Quantum information processing includes quantum compu- 
tation and cryptographic and communication protocols like 
quantum key distribution and dense coding. Quantum com- 
putation is not synonymous with using quantum effects in 
computation; quantum mechanical effects are used in the 
processors of all state of the art (classical) computers. The 
distinction between classical and quantum computation is 
whether the information being processed is encoded in a 
classical or quantum way, in bits or qubits. 

Certainty in quantum mechanics 

Non-probabilistic quantum algorithms 

Glaringly obvious - perhaps blindingly so - examples of non- 
probabilistic quantum algorithms exist: quantum analogs of 
classical non-probabilistic algorithms. Any reversible classical
computation has a directly analogous quantum computation. Any
classical computation has a reversible counterpart using at most
$O(t^{1+\epsilon})$ time and $O(s \log t)$ space (Bennett 1989),
where $t$ and $s$ are the time and space used by the original
computation. If the initial classical algorithm is non-probabilistic,
so are the analogous reversible and quantum algorithms.

More surprising perhaps is that the first truly quantum algorithms - ones that do not have classical counterparts - succeed with certainty. The quantum algorithm for Deutsch's problem (Deutsch 1985; Deutsch & Jozsa 1992) succeeds with certainty. Grover's search algorithm is not inherently probabilistic. His initial algorithm succeeded only with high probability (Grover 1997), but with a little cleverness Grover's algorithm can be modified so that it is guaranteed to find an element being searched for while still preserving the quadratic speed up. Brassard, Høyer, & Tapp (1998) suggest two approaches. In essence, the first rotates by a slightly smaller angle at each step, while the second changes only the last step to a smaller rotation. Shor's factoring algorithm, by contrast, is inherently probabilistic, just like many of the best classical algorithms for related problems like primality testing.
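A small case makes the point about certainty concrete. The sketch below simulates Grover's iteration on a list of N = 4 items with a single marked item, a standard special case in which one iteration already returns the marked item with probability 1. It is not the Brassard, Høyer, & Tapp construction for general N; the function names and the plain state-vector simulation are assumptions made for illustration.

```python
import math

def grover_iteration(amplitudes, marked):
    """One Grover iteration: oracle sign flip, then inversion about the mean."""
    flipped = [-a if i == marked else a for i, a in enumerate(amplitudes)]
    mean = sum(flipped) / len(flipped)
    return [2 * mean - a for a in flipped]

def grover_n4(marked):
    """Outcome probabilities after a single iteration on N = 4 items."""
    n_items = 4
    state = [1 / math.sqrt(n_items)] * n_items  # uniform superposition
    state = grover_iteration(state, marked)
    return [a * a for a in state]

probabilities = grover_n4(marked=2)
print(probabilities)  # all of the probability sits on index 2
```

For other values of N the plain iteration generally overshoots or undershoots the marked state slightly, which is why a modified rotation angle is needed to reach certainty in general.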

Fewer worlds theory of quantum mechanics 

Many papers discuss the pros and cons of the many worlds theory. Here we mean to correct not that theory, but the popular conception of it as "everything happens in some universe". Popular accounts of quantum mechanics, and some scholarly articles, give the impression that quantum mechanics, at least in the many worlds interpretation, implies that everything happens in some universe. A typical quote (Deutsch 1998): "There are even universes in which a given object in our universe has no counterpart - including universes in which I was never born and you wrote this article instead." The variety of imaginative examples suggests that anything we can conceive of, even the highly unlikely, happens, if only in a small number of universes. But much of the surprise of quantum mechanics is that certain things we thought would happen, even things we thought were sure to happen, do not happen at all.

Most startling are events that were predicted to happen 
with certainty by classical physics, but which in fact hap- 
pen with probability 0. Thus, not only is it not true that ev- 
erything we can conceive of is predicted to happen in some 
universe, but things we can hardly conceive of not happen- 
ing do not happen, not in any universe. To emphasize this 
correction, I call it "the fewer worlds than we might think" 
interpretation of quantum mechanics, or the "fewer worlds" 
theory for short. 

Here are a few examples. In the double slit experiment, quantum mechanics predicts that no light reaches certain spots. And indeed no light reaches those spots, even though classically we expect some photons to reach every spot. Even more striking is the GHZ experiment (Greenberger, Horne, & Zeilinger 1989; Greenberger et al. 1990; Pan et al. 2000) in which the classical prediction is that each of four things happens with equal probability and another four things never happen. Quantum mechanics predicts, and experiments confirm, that the four outcomes that are classically predicted to happen never happen (and the four classically prohibited outcomes do occur, with equal probability). As a final example, we saw that many quantum algorithms return a result with probability 1; the obvious conclusion is that the other results do not happen at all.
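The GHZ claim can be checked numerically in one common textbook presentation, which is assumed here because the text does not spell out the experiment. Take the three-qubit state (|000> + |111>)/sqrt(2) and measure each qubit in the X basis. In this presentation a classical local assignment of values, constrained by the perfect correlations seen in the mixed X/Y measurements, forces the product of the three X results to be -1, so only the four outcomes with product -1 would occur. The sketch computes the quantum probabilities; all names in it are hypothetical.

```python
import itertools
import math

# GHZ state (|000> + |111>)/sqrt(2): amplitudes indexed by bit strings.
ghz = {(0, 0, 0): 1 / math.sqrt(2), (1, 1, 1): 1 / math.sqrt(2)}

def x_basis_amplitude(state, signs):
    """Amplitude of the X-basis outcome `signs` (each entry +1 or -1).

    Uses <+|b> = 1/sqrt(2) for both bits b, and
    <-|0> = 1/sqrt(2), <-|1> = -1/sqrt(2).
    """
    total = 0.0
    for bits, amp in state.items():
        overlap = 1.0
        for s, b in zip(signs, bits):
            overlap *= (s if b == 1 else 1) / math.sqrt(2)
        total += amp * overlap
    return total

def x_outcome_probabilities(state):
    """Probability of every joint X-basis outcome on three qubits."""
    return {signs: x_basis_amplitude(state, signs) ** 2
            for signs in itertools.product((+1, -1), repeat=3)}

probs = x_outcome_probabilities(ghz)
for signs, p in probs.items():
    product = signs[0] * signs[1] * signs[2]
    print(signs, "product", product, "probability", round(p, 3))
```

The four outcomes with product +1 each occur with probability 1/4, and the four with product -1 have probability 0, the reverse of the classical expectation described above.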

Uncertainty in classical physics 

Both relativity and uncertainty principles exist in purely 
classical settings. The revolutions of the 20th century,
special and general relativity and quantum mechanics, ex- 
panded on these principles. In special relativity, Einstein 
made Galilean relativity - the notion that the speed of an 
object depends on the observer and is not a property of 
the object itself - compatible with the notion of a constant 
speed of light, the same for all observers. Quantum mechan- 
ics took standard classical uncertainty principles involving 
waves and applied them to particles with the implication that 
nothing of a pure particle nature exists, in this way resolving 
various experimental and theoretical issues. 

That a particle cannot simultaneously have both a pre- 
cisely defined position and a precisely defined momentum is 
the startling content of Heisenberg's uncertainty principle. 
This statement is less surprising when applied to a wave. 
Uncertainty principles for classical waves are well known. 
For example, consider a signal $s(t)$ with a finite mean $\bar{t}$ and standard deviation $\Delta t$. Similarly assume the mean $\bar{\omega}$ and standard deviation $\Delta\omega$ of $s(t)$'s frequency distribution can be calculated. Classical signals $s(t)$ obey the uncertainty principle $\Delta t \, \Delta\omega \ge 1/2$. That a signal with small standard deviation in time cannot have too small a standard deviation in its frequency spectrum is not mysterious. Details can be found in many signal processing books; (Cohen 1995) is particularly detailed and insightful.
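The time/frequency uncertainty relation for classical signals can be checked numerically. The sketch below, an illustration under stated assumptions rather than anything taken from the cited books, uses a Gaussian pulse (the case in which the bound 1/2 is attained), treats the squared magnitude of the signal and of its Fourier transform as the time and frequency densities, and approximates the transform with a Riemann sum. The grid sizes and function names are arbitrary choices.

```python
import cmath
import math

def spread(xs, weights):
    """Standard deviation of xs under unnormalized weights on a uniform grid."""
    total = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / total
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / total
    return math.sqrt(var)

def uncertainty_product(sigma, n=161):
    """Delta_t * Delta_omega for a Gaussian pulse whose energy density has std sigma."""
    ts = [sigma * (-8 + 16 * k / (n - 1)) for k in range(n)]
    ws = [(-4 + 8 * k / (n - 1)) / sigma for k in range(n)]
    dt = ts[1] - ts[0]
    signal = [math.exp(-t * t / (4 * sigma * sigma)) for t in ts]
    # Riemann-sum approximation of S(w) = integral of s(t) exp(-i w t) dt.
    spectrum = [sum(s * cmath.exp(-1j * w * t) for s, t in zip(signal, ts)) * dt
                for w in ws]
    delta_t = spread(ts, [s * s for s in signal])
    delta_w = spread(ws, [abs(S) ** 2 for S in spectrum])
    return delta_t * delta_w

for sigma in (0.25, 1.0):
    print(sigma, uncertainty_product(sigma))
```

Narrowing the pulse in time (smaller `sigma`) widens its spectrum by the same factor, so the product stays at the bound.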

This discussion makes no mention of measurement. Contrary to popular belief, Heisenberg's uncertainty principle is not about imprecision in our ability to measure (though it has implications for measurement). Just as for time and frequency in the signal case, Heisenberg's uncertainty principle says that a particle cannot have definite values for both its position and momentum. The implication is that there are no classical point particles, with position and momentum both precisely defined; there aren't even arbitrarily close approximations to such. The implication of this principle for measurement is that even in an ideal case, in which measurements of a series of particles in identical states were performed perfectly, if the standard deviation of the results for position measurements is small enough then the standard deviation of the results for momentum must be proportionally large. Initially Heisenberg and others confused two arguments, one based on the wave nature of particles, the other based on a disturbance theory of measurement. It is the former that has stood the test of time. The failure of a disturbance theory was established by the famous EPR paper (Einstein, Podolsky, & Rosen 1935), though it took decades before a fuller understanding of the implications of the EPR paradox was achieved by Bell.

Generalized uncertainty principles exist for many other 
pairs of properties. For example, an uncertainty relation for 
polarization says that if a particle has polarization close to 
horizontal or vertical it cannot have polarization close to 
45°. This uncertainty principle is more intuitive than that 
for position and momentum, but the mathematics is closely 
related. 

Applications of a quantum viewpoint to 
information processing 

There exist three distinct classes of applications of the view- 
point that has developed from the study of quantum infor- 
mation processing. The first and most obvious class con- 
tains quantum algorithms and protocols. The second is the 
use of reasoning about quantum systems to obtain insight 
into classical computer science. The third class consists of 
purely classical results inspired by the formalisms developed 
to deal with quantum information processing and quantum 
mechanics more generally. We briefly discuss this last class 
of applications, and then devote a section to each of the first 
two classes. 

Researchers in quantum mechanics, responding to their 
need to delve deeply and carefully into the linear algebra 
and generalized probability theory underlying quantum me- 
chanics, have developed powerful formalisms for discussing 
these areas. Dirac's compact and suggestive bra/ket notation is useful for any work involving significant linear algebra. The operator view gives insight into classical probability theory, and understanding the tensor structure inherent in classical probability theory and its difference from a direct sum structure helps clarify many issues including relationships between joint distributions and their marginals.

Implications of reasoning about quantum systems 
to problems in classical computer science 

We give two surprising, elegant examples. 

Cryptographic protocols usually rely on the empirical 
hardness of a problem for their security; it is rare to be able 
to prove complete, information theoretic security. When a 
cryptographic protocol is designed based on a new prob- 
lem, the difficulty of the problem must be established be- 
fore the security of the protocol can be understood. Empir- 
ical testing of a problem takes a long time. Instead, when- 
ever possible, "reduction" proofs are given that show that 
if the new problem were solved it would imply a solution 
to a known hard problem; the proofs show that the solution 
to the known problem can be reduced to a solution of the 
new problem. Regev (2005) designed a novel, purely classical
cryptographic system based on a certain problem. He
was able to reduce a known hard problem to this problem, 
but only by using a quantum step as part of the reduction 
proof. Thus he has shown that if the new problem is ef- 
ficiently solvable in any way, there is an efficient quantum 
algorithm for the old problem. But it says nothing about 
whether there would be a classical algorithm. This result is 
of practical importance; his new cryptographic algorithm is 
a more efficient lattice based public key encryption system. 
Lattice based systems are currently the leading candidate for 
public key systems secure against quantum attacks. 

More spectacular, if less practical, is Aaronson's new solution
to a notorious conjecture involving a purely classical
complexity class PP (Aaronson 2005b). From 1972 until
1995 this question remained open. Aaronson defines a
new quantum complexity class PostBQP, an extension of 
the standard quantum complexity class BQP, motivated by 
the use of postselection in certain quantum arguments. It 
takes him a page to show that PostBQP=PP, and then only 
three lines to prove the conjecture. Thus it seems that for 
certain questions, the "right" way to view the classical class 
PP is through the eyes of quantum information processing. 

Quantum algorithms and protocols 

Shor's factoring and discrete log algorithms solve impor- 
tant but narrow problems. Grover's algorithm and its gen- 
eralizations are applicable only to a more restricted class of 
problems than many people outside the field realize. For 
example, it is unfortunate that Grover used "database" in the title of (Grover 1997) since his algorithm does not apply to what most people mean by a database. Grover's algorithm only gives a speed-up over unstructured search, and databases, which are generally highly structured, can be searched extremely rapidly classically. At best quantum computation can only give a constant factor improvement for searches of ordered data like that of databases (Childs, Landahl, & Parrilo 2006).



Even worse, obtaining output from Grover's algorithm destroys the quantum superposition, and recreating the superposition often takes time linear in $N$, which negates the $O(\sqrt{N})$ benefit of the search algorithm. For this reason Grover's algorithm and its generalizations are only applicable to searches over data that has a sufficiently uniform and quick generating function which can be used to quickly compute the superposition.

Finding new quantum algorithms has been exceedingly slow going. Some more recent algorithms include (Hallgren 2002) for solving Pell's equation, (Watrous 2001) for the group black box model, and (van Dam, Hallgren, & Ip 2003) for the shifted Legendre symbol problem. The first two are closely related to Shor's algorithm - they are in the class of hidden subgroup problems - and the third makes heavy use of Fourier transforms. In the past five years a new family of quantum algorithms has been discovered that uses techniques of quantum walks to solve a variety of problems, some related to graphs, others to matrix products or commutativity in groups.

For many years Shor's algorithm and Grover's algorithm were viewed as widely different algorithms. Quantum learning theory (Bshouty & Jackson 1999; Servedio 2001; Gortler & Servedio 2004; Hunziker et al. 2003; Atici & Servedio 2005) is closely tied to both. Quantum learning descends from computational learning theory, a subfield of artificial intelligence. Computational learning is concerned with concept learning. Common models include exact learning and probably approximately correct (PAC) learning. A concept is modeled by its membership as given by a Boolean function $c : \{0,1\}^n \to \{0,1\}$. Let $C = \{c_i\}$ be a concept class. Say one has access to an oracle $O_c$ for one of the concepts $c$ in $C$, but one doesn't know which. The types of oracles assumed vary, but a common one is a membership oracle which upon input of $x$ outputs $c(x)$. In the quantum case, one can input superpositions of inputs to obtain superpositions of outputs.
One can ask a variety of questions as to how quickly and 
with how many queries to the oracle can the concept c be 
determined. Sample results in this area include the negative 
result that the number of classical and quantum queries 
required for any concept class does not differ by more than 
a polynomial in either the exact or PAC model. However the 
story is different if computational efficiency is taken into 
account. In the exact model the existence of any classical 
one-way function guarantees the existence of a concept 
class which is polynomial-time learnable in the quantum 
case but not in the classical. For the PAC model a slightly 
weaker result is known in terms of a particular one-way 
function. 

Probability theory and quantum mechanics 

To quote Scott Aaronson (Aaronson 2005a):

"To describe a state of n particles, we need to write down an exponentially long vector of exponentially small numbers, which themselves vary continuously. Moreover, the instant we measure a particle, we "collapse" the vector that describes its state - and not only that, but possibly the state of another particle on the opposite side of the universe. Quick, what theory have I just described?"

"The answer is classical probability theory. The moral is 
that, before we throw up our hands over the "extravagance" 
of the quantum worldview, we ought to ask: is it so much 
more extravagant than the classical probabilistic worldview? 
After all, both involve linear transformations of exponen- 
tially long vectors that are not directly observable." 

We spend the next section putting this view of prob- 
ability theory on a firm basis. We then describe how 
quantum mechanics is a formal extension of probability 
theory. We only sketchily describe this extension; more 
details can be found in (Strocchi 2005; Kuperberg 2005;
Rédei & Summers 2006; Kitaev, Shen, & Vyalyi 2002;
Sudbery 1986; Mackey 1963).



Many, but not all, of the unintuitive aspects of quantum 
mechanics exist in classical probability theory. Entangle- 
ment does not exist in classical probability, but classical cor- 
relations are strange enough, judging by human reaction to 
many of them. 

A view of classical probability theory 

Let $A$ be a set of $n$ elements. A probability distribution $\mu$ on $A$ is a function

$$\mu : A \to [0,1]$$

such that $\sum_{a \in A} \mu(a) = 1$. The space $\mathcal{P}^A$ of all probability distributions over $A$ has dimension $n - 1$. We can view $\mathcal{P}^A$ as the $(n-1)$-dimensional simplex $\sigma_{n-1} = \{x \in \mathbf{R}^n \mid x_i \ge 0, \; x_1 + x_2 + \cdots + x_n = 1\}$ which is contained in the $n$-dimensional space $\mathbf{R}^A$, the space of all functions from $A$ to $\mathbf{R}$,

$$\mathbf{R}^A = \{f : A \to \mathbf{R}\}.$$

For $n = 2$, the simplex $\sigma_{n-1}$ is the line segment from $(1,0)$ to $(0,1)$. The vertices of the simplex correspond to the elements $a \in A$: a probability distribution $\mu$ maps to the point in the simplex $x = (\mu(a_1), \mu(a_2), \dots, \mu(a_n))$.

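Let $B$ be a set of $m$ elements. Let $A \times B$ be the Cartesian product $A \times B = \{(a, b) \mid a \in A, b \in B\}$. What is the relation between $\mathcal{P}^{A \times B}$, the space of all probability distributions over $A \times B$, and the spaces $\mathcal{P}^A$ and $\mathcal{P}^B$? The tempting guess is not correct: $\mathcal{P}^{A \times B} \ne \mathcal{P}^A \times \mathcal{P}^B$. We see this relation does not hold by checking dimensions. First consider the relationship between $\mathbf{R}^{A \times B}$ and $\mathbf{R}^A$ and $\mathbf{R}^B$. Since $A \times B$ has cardinality $|A \times B| = |A||B| = nm$, $\mathbf{R}^{A \times B}$ has dimension $nm$, which in general is not equal to $n + m$, the dimension of $\mathbf{R}^A \times \mathbf{R}^B$. Since in general $\dim(\mathcal{P}^A) = \dim(\mathbf{R}^A) - 1$, $\dim(\mathcal{P}^{A \times B}) = nm - 1$ which is not equal to $n + m - 2$, the dimension of $\mathcal{P}^A \times \mathcal{P}^B$, so $\mathcal{P}^{A \times B} \ne \mathcal{P}^A \times \mathcal{P}^B$. Instead $\mathbf{R}^{A \times B}$ is the tensor product $\mathbf{R}^A \otimes \mathbf{R}^B$ of $\mathbf{R}^A$ and $\mathbf{R}^B$. So $\mathcal{P}^{A \times B} \subset \mathbf{R}^A \otimes \mathbf{R}^B$.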
Tensor products are rarely mentioned in probability textbooks, but the tensor product is as much a part of probability theory as of quantum mechanics. The tensor product structure inherent in probability theory should be stressed more often; one of the sources of mistaken intuition about probabilities is a tendency to try to impose the more familiar direct product structure on what is actually a tensor product structure. We briefly review tensor products here; readers not familiar with tensor products should consult more extensive expositions (Rieffel & Polak 2000; Nielsen & Chuang 2001; Rieffel & Polak in preparation).

The tensor product $V \otimes W$ of two vector spaces $V$ and $W$ with bases $A = \{a_1, a_2, \dots, a_n\}$ and $B = \{b_1, b_2, \dots, b_m\}$ respectively is an $nm$-dimensional vector space with basis $a_i \otimes b_j$ where $\otimes$ is the tensor product, an abstract binary operator defined by the following relations:

$$(v_1 + v_2) \otimes x = v_1 \otimes x + v_2 \otimes x$$
$$v \otimes (x_1 + x_2) = v \otimes x_1 + v \otimes x_2$$
$$(\alpha v) \otimes x = v \otimes (\alpha x) = \alpha (v \otimes x).$$

Taking $k = \min(n, m)$, all elements of $V \otimes W$ have form

$$v_1 \otimes w_1 + v_2 \otimes w_2 + \cdots + v_k \otimes w_k.$$

Due to the relations defining the tensor product such a representation is not unique. Furthermore, most elements of $V \otimes W$ cannot be written as $v \otimes w$ where $v \in V$ and $w \in W$.

Let $A_0 = \{0_0, 1_0\}$, $A_1 = \{0_1, 1_1\}$, and $A_2 = \{0_2, 1_2\}$, where $1_0$ versus $0_0$ corresponds to whether or not the next person you meet is interested in quantum mechanics, $A_1$ to whether they know the solution to the Monty Hall problem, and $A_2$ to whether they are at least 5'6" tall. So $1_0 1_1 0_2$ corresponds to someone under 5'6" who is interested in quantum mechanics and knows the solution to the Monty Hall problem. We often write 110 instead of $1_0 1_1 0_2$; the subscripts are implied by the position. A probability distribution over the set of eight possibilities, $A_0 \times A_1 \times A_2$, has form

$$p = (p_{000}, p_{001}, p_{010}, p_{011}, p_{100}, p_{101}, p_{110}, p_{111}).$$

More generally, a probability distribution over $A_0 \times A_1 \times \cdots \times A_k$, where the $A_i$ are all 2 element sets, is a vector of length $2^{k+1}$. We now understand the first part of Aaronson's remark: vectors in probability theory are exponentially long.

Given functions $f : A \to \mathbf{R}$ and $g : B \to \mathbf{R}$, define the tensor product $f \otimes g : A \times B \to \mathbf{R}$ by $(a, b) \mapsto f(a)g(b)$. If $\mu$ and $\nu$ are probability distributions, then so is $\mu \otimes \nu$. A linear combination of distributions is a distribution as long as the linear coefficients are non-negative and sum to 1. Conversely, any distribution $\eta \in \mathcal{P}^{A \times B}$ is a linear combination of distributions of the form $\mu \otimes \nu$ with linear factors summing to 1.
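The tensor product of two distributions is easy to compute directly. The sketch below represents a distribution as a Python dictionary from elements to probabilities (a representation chosen here for illustration, with made-up element names and values) and checks that the product of a 2-element and a 3-element distribution is a distribution over 6 = nm pairs, not over 5 = n + m elements, and that a convex combination of product distributions is again a distribution.

```python
def tensor(mu, nu):
    """Tensor product of two distributions: (a, b) -> mu(a) * nu(b)."""
    return {(a, b): p * q for a, p in mu.items() for b, q in nu.items()}

def mix(weight, first, second):
    """Convex combination weight*first + (1 - weight)*second on a common set."""
    return {key: weight * first[key] + (1 - weight) * second[key] for key in first}

mu = {"a1": 0.2, "a2": 0.8}
nu = {"b1": 0.5, "b2": 0.3, "b3": 0.2}
joint = tensor(mu, nu)

other = tensor({"a1": 0.9, "a2": 0.1}, {"b1": 0.1, "b2": 0.1, "b3": 0.8})
mixture = mix(0.25, joint, other)

print(len(joint), sum(joint.values()), sum(mixture.values()))
```

The mixture is a legitimate joint distribution, but in general it is no longer a single tensor product of two distributions.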

A joint distribution $\mu \in \mathcal{P}^{A \times B}$ is independent or uncorrelated if it can be written as a tensor product $\mu_A \otimes \mu_B$ of distributions $\mu_A \in \mathcal{P}^A$ and $\mu_B \in \mathcal{P}^B$. The vast majority of joint distributions do not have this form, in which case they are correlated. For any joint distribution $\mu \in \mathcal{P}^{A \times B}$, define a marginal distribution $\mu_A \in \mathcal{P}^A$ by

$$\mu_A : a \mapsto \sum_{b \in B} \mu(a, b).$$

An uncorrelated distribution is the tensor product of its marginals. Other distributions cannot be reconstructed from their marginals; information has been lost.
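The loss of information in passing to marginals can be seen in a two-bit example. The sketch below, with hypothetical names and values, computes both marginals of a joint distribution and tests whether the joint distribution equals the tensor product of its marginals. A distribution in which two bits always agree has uniform marginals, yet the product of those marginals is the uniform distribution on four outcomes, not the original.

```python
def marginals(joint):
    """Return the two marginals of a joint distribution {(a, b): probability}."""
    mu_a, mu_b = {}, {}
    for (a, b), p in joint.items():
        mu_a[a] = mu_a.get(a, 0.0) + p
        mu_b[b] = mu_b.get(b, 0.0) + p
    return mu_a, mu_b

def tensor(mu, nu):
    """Tensor product of two distributions: (a, b) -> mu(a) * nu(b)."""
    return {(a, b): p * q for a, p in mu.items() for b, q in nu.items()}

def is_uncorrelated(joint, tol=1e-12):
    """True when the joint distribution is the tensor product of its marginals."""
    mu_a, mu_b = marginals(joint)
    product = tensor(mu_a, mu_b)
    return all(abs(product[key] - joint.get(key, 0.0)) <= tol for key in product)

agree = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}
independent = tensor({0: 0.3, 1: 0.7}, {0: 0.6, 1: 0.4})

print(marginals(agree))
print(is_uncorrelated(agree), is_uncorrelated(independent))
```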



A distribution $\mu$ on a finite set $A$ that is concentrated entirely at one element is said to be pure; on a set $A$ of $n$ elements there are exactly $n$ pure distributions $\mu_a : A \to [0,1]$, one for each element of $A$, where

$$\mu_a(a') = \begin{cases} 1 & \text{if } a' = a \\ 0 & \text{otherwise.} \end{cases}$$

All other distributions are said to be mixed.

Let us return to the example of the traits for the next person you meet. Unless you know all of these traits, the distribution $p = (p_{000}, \dots, p_{111})$ is a mixed distribution. When you meet the person you can observe their traits. Once you have made these observations, the distribution "collapses" to a pure distribution. For example, if the person is interested in quantum mechanics, does not know the solution to the Monty Hall problem, and is 5'8", the "collapsed" distribution is $(0,0,0,0,0,1,0,0)$.

To understand the final part of Aaronson's remark, consider another example. Say someone prepares two sealed envelopes with identical pieces of paper and sends them to opposite sides of the universe. Half the time both envelopes contain 0; half the time 1. The initial distribution is $p_I = (1/2, 0, 0, 1/2)$. If someone then opens one of the envelopes and observes a 0, the state of the contents of the other envelope is immediately known - known faster than light can travel between the envelopes - and the distribution "collapses" to $p_F = (1, 0, 0, 0)$.

Are we disturbed by the "extravagance" of the exponential state space of classical probability theory, and the "faster-than-light collapse" of these classical vectors under observation? Another question one might ask is: can this "extravagance" be used to facilitate computation? The answer is a resounding yes; allowing randomness does give additional computational power. See (Harel 1987; Traub & Werschulz 1999) for delightful expositions of the computational benefits of randomness.
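The classical "collapse" under observation is nothing more than conditioning. The sketch below applies it to the sealed-envelope distribution (1/2, 0, 0, 1/2) over the outcomes 00, 01, 10, 11; the dictionary representation and the function name `observe` are choices made here for illustration.

```python
def observe(joint, position, value):
    """Condition a joint distribution on seeing `value` at coordinate `position`."""
    total = sum(p for key, p in joint.items() if key[position] == value)
    if total == 0:
        raise ValueError("observed value has probability zero")
    return {key: (p / total if key[position] == value else 0.0)
            for key, p in joint.items()}

envelopes = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}

# Opening the first envelope and finding a 0 fixes the second one as well.
after = observe(envelopes, position=0, value=0)
print(after)
```

Nothing travels between the envelopes; only the observer's description changes, which is the sense in which the classical vector "collapses".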

To fully understand the relationship between quantum mechanics and probability theory it is useful to view probability distributions as operators. Consider the set of linear operators $\mathcal{M}^A = \{M : \mathbf{R}^A \to \mathbf{R}^A\}$. To every function $f : A \to \mathbf{R}$, there is an associated operator $M_f : \mathbf{R}^A \to \mathbf{R}^A$ given by $M_f : g \mapsto fg$. An operator $M$ is said to be a projector if $M^2 = M$. The probability distributions $\mu$ whose corresponding operators $M_\mu$ are projectors are exactly the pure distributions. The matrix for the operator corresponding to a function is always diagonal; for a probability distribution, diagonal and trace 1. For example, the operator corresponding to the probability distribution $p_I = (1/2, 0, 0, 1/2)$ is represented by the matrix

$$\begin{pmatrix} 1/2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1/2 \end{pmatrix}.$$

Quantum mechanics as a generalization of 
probability theory 

The vector representation of a quantum state has redundancy that can be confusing; any vector multiplied by a unit length complex number $e^{i\theta}$ - called the global phase - represents the same quantum state. Another way of representing quantum states removes this ambiguity and makes the relation with probability theory clearer. We follow Dirac's elegant and compact bra/ket notation. The row vector $\langle v|$ is the conjugate transpose of the column vector $|v\rangle$. For any $N$ dimensional vector $|v\rangle$ representing a quantum state we can construct a density operator, the $N \times N$ matrix $|v\rangle\langle v|$. The density operator $|v\rangle\langle v|$ representing a quantum state no longer has ambiguity due to the global phase. Like the operators corresponding to probability distributions, the operators corresponding to quantum states have trace 1 and are positive and Hermitian. Density operators corresponding to quantum states $|v\rangle$ are trace 1 projectors and so have rank 1. Unlike operators for probability distributions, density operators need not be diagonal. For example, the density operator for the state $|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ is

$$|+\rangle\langle +| = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}.$$

This example illustrates that superpositions are distinct from mixtures of basis states since such mixtures must be diagonal: the fifty-fifty mixture of $|0\rangle$ and $|1\rangle$ has density operator

$$\frac{1}{2}(|0\rangle\langle 0| + |1\rangle\langle 1|) = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}.$$
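The difference between an equal superposition and a fifty-fifty mixture shows up directly in their density operators. The following sketch, using plain nested lists and helper names invented for this illustration, builds the density operator of the equal superposition of |0> and |1>, checks that a global phase leaves it unchanged, and checks that it is a projector while the diagonal mixture is not.

```python
import cmath
import math

def density(v):
    """Density operator |v><v| of a state vector given as a list of amplitudes."""
    return [[a * complex(b).conjugate() for b in v] for a in v]

def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(x, y, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(a - b) <= tol
               for row_x, row_y in zip(x, y) for a, b in zip(row_x, row_y))

def is_projector(rho):
    """True when rho squared equals rho."""
    return close(matmul(rho, rho), rho)

plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]
rho_plus = density(plus)
rho_mixed = [[0.5, 0.0], [0.0, 0.5]]

phase = cmath.exp(0.7j)  # an arbitrary global phase
rho_phased = density([phase * a for a in plus])

print(close(rho_plus, rho_phased), is_projector(rho_plus), is_projector(rho_mixed))
```

Both operators have trace 1 and the same diagonal; only the off-diagonal entries distinguish the superposition from the mixture.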



The analog of taking the marginal is taking the partial trace. The partial trace $\mathrm{tr}_W \rho_{VW}$ of an operator $\rho_{VW} : V \otimes W \to V \otimes W$ with respect to the subsystem $W$ is the operator

$$\rho_V = \mathrm{tr}_W \rho_{VW} = \sum_i \langle b_i | \rho_{VW} | b_i \rangle$$

that acts on subsystem $V$, where $\{|b_i\rangle\}$ is an orthonormal basis for $W$. Taking the partial trace of a density operator produces another density operator, a Hermitian, positive, trace 1 operator. Density operators obtained from the partial trace model what can be learned about a subsystem from measurements on that subsystem alone. In this context they are often called mixed states. Density operators of the form $|v\rangle\langle v|$ are called pure states, or just quantum states. For example, the Bell state $|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|0\rangle \otimes |0\rangle + |1\rangle \otimes |1\rangle) = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ has density operator

$$\frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}$$

and its partial trace with respect to either one of its qubits is the 2-dimensional density operator $\frac{1}{2} I$.
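The partial trace of the Bell state can be computed in a few lines. The sketch below assumes the two-qubit basis order 00, 01, 10, 11 and uses helper names invented for this illustration; it traces out either qubit of the Bell state density operator and recovers half the identity in both cases.

```python
import math

def density(v):
    """Density operator |v><v| of a state vector given as a list of amplitudes."""
    return [[a * complex(b).conjugate() for b in v] for a in v]

def partial_trace(rho, keep):
    """Reduce a 4x4 two-qubit operator to one qubit.

    keep=0 keeps the first qubit (traces out the second);
    keep=1 keeps the second qubit (traces out the first).
    """
    if keep == 0:
        return [[sum(rho[2 * i + k][2 * j + k] for k in range(2))
                 for j in range(2)] for i in range(2)]
    return [[sum(rho[2 * k + i][2 * k + j] for k in range(2))
             for j in range(2)] for i in range(2)]

bell = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]  # (|00> + |11>)/sqrt(2)
rho_bell = density(bell)

reduced_first = partial_trace(rho_bell, keep=0)
reduced_second = partial_trace(rho_bell, keep=1)
print(reduced_first)
print(reduced_second)
```

The reduced operators are diagonal with entries 1/2, so each qubit on its own looks like a fair coin even though the joint state is pure.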

Since every Hermitian operator can be diagonalized, every density operator $\rho$ can be written as $\sum_i p_i |\psi_i\rangle\langle\psi_i|$, a probability distribution over pure quantum states where the $|\psi_i\rangle$ are mutually orthogonal eigenvectors of $\rho$, and $p_i$ are the eigenvalues. Conversely any probability distribution $\mu$ over a set of orthogonal quantum states $|\psi_1\rangle, |\psi_2\rangle, \dots, |\psi_L\rangle$ where $\mu : |\psi_i\rangle \mapsto p_i$ has a corresponding density operator $\rho_\mu = \sum_i p_i |\psi_i\rangle\langle\psi_i|$. In the basis $\{|\psi_i\rangle\}$, the density operator $\rho_\mu$ is diagonal with entries $p_1, \dots, p_L$. Under the isomorphism between $\mathbf{R}^L$ and the subspace of $V$ generated by $|\psi_1\rangle, |\psi_2\rangle, \dots, |\psi_L\rangle$, the density operator $\rho_\mu$ realizes the operator $M_\mu$. Thus a probability distribution over a set of orthonormal quantum states $\{|\psi_i\rangle\}$ can be viewed as a trace 1 diagonal matrix acting on $\mathbf{R}^L$.

Although every density operator can be viewed as a probability distribution over a set of orthogonal quantum states, this representation is not in general unique. More importantly, for most pairs of density operators $\rho_1$ and $\rho_2$, there is no basis over which both $\rho_1$ and $\rho_2$ are diagonal. In particular, only if $\rho_1$ and $\rho_2$ commute are they simultaneously diagonalizable, so only in this case can they both be viewed as probability distributions over the same set of states. Thus, although each density operator of dimension $N$ can be viewed as a probability distribution over $N$ states, the space of all density operators is much larger than the space of probability distributions over $N$ states. Let $\rho : V \to V$ be a density operator. A density operator $\rho$ corresponds to a pure state if and only if it is a projector. This statement is analogous to that for probability distributions; the pure states correspond exactly to rank 1 density operators, and mixed states have rank greater than 1. Density operators are also used to model probability distributions over pure states, particularly probability distributions over the possible outcomes of a measurement yet to be performed. Their use here is analogous to the classical use of probability distributions to model the probabilities of possible traits before they can be observed.

A pure quantum state $|\psi\rangle$ is entangled if it cannot be written as the tensor product of single qubit states. For a mixed quantum state, it is important to determine if all of its correlation comes from being a mixture in the classical sense or if it is also correlated in a quantum fashion. A mixed quantum state $\rho : V \otimes W \to V \otimes W$ is said to be uncorrelated if $\rho = \rho_V \otimes \rho_W$ for some density operators $\rho_V : V \to V$ and $\rho_W : W \to W$. Otherwise $\rho$ is said to be correlated. A mixed quantum state $\rho$ is said to be separable if it can be written $\rho = \sum_j p_j |\psi_j^V\rangle\langle\psi_j^V| \otimes |\phi_j^W\rangle\langle\phi_j^W|$ where $|\psi_j^V\rangle \in V$ and $|\phi_j^W\rangle \in W$. In other words, $\rho$ is separable if all the correlation comes from its being a classical mixture of uncorrelated quantum states. If a mixed state $\rho$ is not separable it is entangled. For example, the mixed state $\rho_{cc} = \frac{1}{2}(|00\rangle\langle 00| + |11\rangle\langle 11|)$ is classically correlated but not entangled whereas the Bell state $|\Phi^+\rangle\langle\Phi^+| = \frac{1}{2}(|00\rangle + |11\rangle)(\langle 00| + \langle 11|)$ is entangled. The marginals of a pure distribution are always pure, but the analogous statement is not true for quantum states; all of the partial traces of a pure state are pure only if the original pure state was not entangled. As we saw, the partial traces of the Bell state $|\Phi^+\rangle$, a pure state, are not pure. Most pure quantum states are entangled, exhibiting quantum correlations with no classical analog. All pure probability distributions are completely uncorrelated.

Classical and quantum analogs:

Classical probability                        Quantum mechanics
-------------------------------------------  -------------------------------------------
probability distribution $\mu$ viewed as     density operator $\rho$
  operator $M_\mu$
pure distribution: $M_\mu$ is a projector    pure state: $\rho$ is a projector
marginal distribution                        partial trace
a distribution is uncorrelated if it is      a state is uncorrelated if it is
  the tensor product of its marginals          the tensor product of its partial traces


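Key difference:

Classical probability                         | Quantum mechanics
----------------------------------------------+----------------------------------------------
pure distributions are always uncorrelated    | pure states contain no classical correlation
                                              | but can be entangled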



Where does the power of quantum information processing come from?

Quantum parallelism?

For any classical computation of a function $f(x)$ on $n$ bits, the
analogous quantum computation $U_f$ produces a superposition
$\sum_x |x, f(x)\rangle$ of all input/output pairs upon input of a
superposition of all input values. The ability of a quantum computer to
obtain a superposition of all input/output pairs with effort similar to
what a classical computer needs to obtain a single pair is called
quantum parallelism. Since quantum parallelism enables one to work
simultaneously with $2^n$ values, it in some sense circumvents the
time/space trade-off of classical parallelism through its ability to
hold exponentially many computed values in a linear amount of physical
space. However, this effect is less powerful than it may initially
appear.

We can gain only limited information from this superposition: these
$2^n$ values of $f$ are not independently accessible. We only gain
information by measuring, but measuring in the standard basis projects
the final state onto a single input/output pair $|x, f(x)\rangle$, and
a random one at that. By itself, quantum parallelism is useless.
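
The limitation can be illustrated with a toy classical simulation. This is a sketch under stated assumptions, not the paper's construction: the function `f` is an arbitrary hypothetical example, the superposition is stored explicitly as a dictionary of amplitudes with a uniform normalization (which costs exponential classical memory), and measurement is modeled by sampling with probability equal to the squared amplitude.

```python
import random

def f(x):
    """Hypothetical example function on 3-bit inputs (an assumption)."""
    return (3 * x + 1) % 8

def parallel_superposition(n):
    """Amplitudes of sum_x |x, f(x)> / sqrt(2^n), keyed by the pair (x, f(x))."""
    N = 2 ** n
    amp = N ** -0.5
    return {(x, f(x)): amp for x in range(N)}

def measure(state, rng):
    """Standard-basis measurement: sample one pair with probability |amplitude|^2."""
    pairs = list(state)
    weights = [abs(state[p]) ** 2 for p in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]

rng = random.Random(0)
state = parallel_superposition(3)   # holds all 8 input/output pairs at once
x, fx = measure(state, rng)         # but a measurement reveals only one of them
```

Although `state` contains every pair, a single call to `measure` returns just one `(x, f(x))`, chosen at random, which is no more than one classical evaluation of `f` on a random input would give.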

While $N = 2^n$ output values of $f(x)$ appear in the single
superposition state, it still takes $N = 2^n$ computations of $U_f$ to
obtain them all, no better than the classical case. This limitation
leaves open the possibility that quantum parallelism can help in cases
where only a single output, or a small number of outputs, is desired.
That possibility suggests an exponential speed-up, but such speed-ups
are rare. It has been proven that no quantum algorithm can improve on
the $O(\sqrt{N})$ that Grover's algorithm achieves for unstructured
search (Bennett et al. 1997), and for many other problems it has been
proven that quantum computation cannot provide any speed-up
(Beals et al. 2001; Ambainis 2000).
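
To make the $O(\sqrt{N})$ figure concrete, the sketch below simulates Grover's iteration classically on a vector of $N$ real amplitudes. It illustrates only the upper bound achieved by Grover's algorithm, not the lower-bound proof; the function name and the choice of marked item are assumptions made for illustration.

```python
import math

def grover_success_probability(n, marked, iterations):
    """Simulate Grover search on N = 2^n items; return P(measuring the marked item)."""
    N = 2 ** n
    amps = [1 / math.sqrt(N)] * N            # uniform superposition
    for _ in range(iterations):
        amps[marked] = -amps[marked]         # oracle: flip the sign of the marked item
        mean = sum(amps) / N
        amps = [2 * mean - a for a in amps]  # diffusion: inversion about the mean
    return amps[marked] ** 2

n = 10
N = 2 ** n                                   # 1024 items
k = math.floor(math.pi / 4 * math.sqrt(N))   # about (pi/4) * sqrt(N) oracle calls
p = grover_success_probability(n, marked=123, iterations=k)
```

With $N = 1024$, roughly $(\pi/4)\sqrt{N} \approx 25$ oracle calls bring the success probability close to 1, whereas a classical search needs on the order of $N$ queries.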

Exponential size of quantum state space? 

As we have seen, exponential spaces also arise in classical probability
theory. Furthermore, what would it mean for an efficient algorithm to
take advantage of the exponential size of a space? A superposition like
$\sum_x |x, f(x)\rangle$ is only a single state of the quantum state
space. The vast majority of states cannot even be approximated by an
efficient quantum algorithm (Knill 1995). An efficient quantum algorithm
cannot even come close to most states in the state space. So quantum
parallelism does not, and efficient quantum algorithms cannot, make use
of the full state space.

Quantum Fourier transforms? 

Most quantum algorithms use quantum Fourier transforms (QFTs). The
Walsh-Hadamard transformation, a QFT over the group $\mathbb{Z}_2^n$,
is frequently used to create a superposition of $2^n$ input values. In
addition, the heart of most quantum algorithms makes use of QFTs. Shor's
and Grover's algorithms use QFTs in both of these ways. Many researchers
speculated that quantum Fourier transforms are the paramount quantum
resource for quantum computation. So it came as a surprise when
(Aharonov, Landau, & Makowsky 2006) showed that the QFT is classically
simulatable. Given the ubiquity of quantum Fourier transforms in quantum
algorithms, researchers continue to consider QFTs as one of the main
tools of quantum computation, but in themselves they are not sufficient.
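
The first use of the Walsh-Hadamard transformation mentioned above, creating a uniform superposition from the all-zero state, can be sketched with a classical state-vector simulation. This is an illustrative sketch only: it uses the standard fast Walsh-Hadamard butterfly on an explicit list of $2^n$ amplitudes, and the function name is an assumption.

```python
import math

def walsh_hadamard(amps):
    """Apply a Hadamard gate to every qubit of a state vector of length 2^n."""
    out = list(amps)
    N = len(out)
    h = 1
    while h < N:
        # Butterfly step: combine entries whose indices differ in one bit.
        for start in range(0, N, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    norm = 1 / math.sqrt(N)
    return [norm * a for a in out]

n = 3
zero_state = [1.0] + [0.0] * (2 ** n - 1)   # |00...0>
uniform = walsh_hadamard(zero_state)        # equal amplitudes on all 2^n inputs
back = walsh_hadamard(uniform)              # the transform is its own inverse
```

Every entry of `uniform` equals $1/\sqrt{2^n}$, and applying the transform a second time returns the all-zero state, reflecting that the Walsh-Hadamard transformation is self-inverse.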

Entanglement? 

(Jozsa & Linden 2003) show that any quantum algorithm involving only
pure states that achieves exponential speed-up over classical algorithms
must entangle a large number of qubits. While entanglement is necessary
for an exponential speed-up, the existence of entanglement is far from
sufficient to guarantee a speed-up, and it may turn out that another
property better characterizes what gives a speed-up. Many entangled
systems have been shown to be classically simulatable
(Vidal 2003; Markov & Shi 2005). Furthermore, if one looks at query
complexity instead of algorithmic complexity, an exponential benefit can
be obtained without any entanglement whatsoever. (Meyer 2000) shows
that in the course of the Bernstein-Vazirani algorithm, which achieves
an $n$ to 1 reduction in the number of queries required, no qubits
become entangled. More obviously, the BB84 quantum key distribution
protocol makes no use of entanglement.
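
The absence of entanglement in the Bernstein-Vazirani algorithm can be checked in a small classical simulation. The sketch below is a simplification and an assumption, not the construction in (Meyer 2000): it models the oracle as a phase oracle $|x\rangle \mapsto (-1)^{s \cdot x}|x\rangle$ acting on the $n$ query qubits, omits the ancilla qubit, and uses an arbitrary hidden string `s`.

```python
import math

def parity(x):
    """Parity of the number of 1 bits in x."""
    return bin(x).count("1") % 2

def state_after_query(s, n):
    """State after one phase-oracle query: sum_x (-1)^(s.x) |x> / sqrt(2^n)."""
    N = 2 ** n
    return [(-1) ** parity(s & x) / math.sqrt(N) for x in range(N)]

def product_state(s, n):
    """Tensor product over qubits of (|0> + (-1)^(s_i) |1>) / sqrt(2)."""
    state = [1.0]
    for i in reversed(range(n)):             # most significant qubit first
        sign = -1.0 if (s >> i) & 1 else 1.0
        qubit = [1 / math.sqrt(2), sign / math.sqrt(2)]
        state = [a * b for a in state for b in qubit]
    return state

def hadamard_all(amps):
    """Apply a Hadamard gate to every qubit (fast Walsh-Hadamard transform)."""
    out = list(amps)
    N = len(out)
    h = 1
    while h < N:
        for start in range(0, N, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return [a / math.sqrt(N) for a in out]

n, s = 4, 0b1011                             # hypothetical hidden string
queried = state_after_query(s, n)
is_product = all(abs(a - b) < 1e-12
                 for a, b in zip(queried, product_state(s, n)))
final = hadamard_all(queried)
recovered = max(range(2 ** n), key=lambda x: abs(final[x]))
```

In this model the state after the single query equals a tensor product of single-qubit states, so no qubits are entangled, and a final layer of Hadamard gates maps it to the basis state $|s\rangle$, recovering all $n$ bits of `s` from one query.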

For these reasons entanglement should not be viewed as 
the sole source of power in quantum information process- 
ing. However it is important in many contexts, and required 
in others. While researchers have long recognized entangle- 
ment as a uniquely quantum resource, much about entan- 
glement is poorly understood. Entanglement with respect 
to tensor decompositions of only two factors is completely 
characterized for pure states, and well studied for mixed 
states. See (Bruss 2002) for an introductory survey. But un- 
derstanding bi-partite entanglement is of limited utility for 
understanding quantum computation because there we are 
interested in entanglement between large numbers of qubits. 
Full characterization of entanglement with respect to tensor
decompositions with many factors is difficult; whereas in the bi- or
tripartite cases only a finite number of parameters are needed,
infinitely many parameters are required for four or more tensor factors
(Dür, Vidal, & Cirac 2000).

Instead of trying to fully characterize multipartite entanglement, we
can ask which types of entanglement are useful, and for what.
Significant progress has been made here, though much work remains.
Cluster states were discovered to be a universal resource for quantum
computation. In cluster state, or one-way, quantum computing
(Raussendorf, Browne, & Briegel 2003; Nielsen 2005) a highly entangled
"cluster" state is set up at the beginning of the algorithm. All
computations take place by single qubit measurements, so the
entanglement between the qubits can only decrease in the course of the
algorithm (the reason for the "one-way" name). The initial cluster state
is independent of the algorithm to be performed; it depends only on the
size of the problem to be solved. In this way cluster state quantum
computation makes a clean separation between the entanglement creation
and the computational stages. While the cluster state model clarifies
somewhat the role of entanglement in quantum computation, in another
model, adiabatic quantum computation (Aharonov et al. 2004), which like
the cluster state model has been proved equivalent to the standard
circuit model of quantum computation, the role of entanglement is
obscure. Many intriguing questions as to the source of power in quantum
information processing remain, and are likely to remain for many years
while we humans struggle to understand what Nature allows us to compute
quickly and why.